Smart Paper Notes

Intellectual Property Evaluation Utilizing Machine Learning

Jinxin Ding , Yuxin Huang , Keyang Ni , Xueyao Wang , Yinxiao Wang , Yucheng Wang

Category: Artificial Intelligence

2022-08-18

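Intellectual property is increasingly important to economic development. To address the pain points of traditional methods in IP evaluation, we are developing a new technology with machine learning at its core. We have built an online platform and will expand our business in the Greater Bay Area.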

translated by Google Translate

Learning from Mixed Datasets: A Monotonic Image Quality Assessment Model

Zhaopeng Feng , Keyang Zhang , Baoliang Chen , Shiqi Wang

Categories: Computer Vision | Machine Learning

2022-09-21

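Deep-learning-based image quality assessment (IQA) models usually learn to predict image quality from a single dataset, which leads the model to overfit to specific scenes. Mixed-dataset training can therefore be an effective way to improve the model's generalization ability. However, combining different IQA datasets is nontrivial, because their quality evaluation criteria, score ranges, viewing conditions, and subjects are usually not shared during image quality annotation. In this paper, instead of aligning the annotations, we propose a monotonic neural network for IQA model learning with different datasets combined. In particular, our model consists of a dataset-shared quality regressor and several dataset-specific quality transformers. The quality regressor aims to obtain the perceptual quality for each dataset, while each quality transformer maps the perceptual quality to the corresponding dataset annotations while preserving monotonicity. Experimental results verify the effectiveness of the proposed learning strategy, and our code is available at https://github.com/fzp0424/monotoniciqa.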
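The split between a shared regressor and per-dataset monotonic transformers can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the network details, so `MonotonicTransformer`, its tanh-based parameterization, the averaging `shared_regressor`, and all parameter values are hypothetical. The only property the sketch is meant to show is that forcing the weights positive makes each dataset-specific map non-decreasing in the shared quality.

```python
import math


def softplus(x: float) -> float:
    """Smooth map from any real number to a strictly positive one."""
    return math.log1p(math.exp(x))


class MonotonicTransformer:
    """Hypothetical dataset-specific map from shared quality q to a score scale.

    The inner and outer weights pass through softplus, so they are always
    positive; combined with the increasing tanh, the output can only grow
    as q grows.
    """

    def __init__(self, outer, inner, shift, bias):
        self.outer, self.inner, self.shift, self.bias = outer, inner, shift, bias

    def __call__(self, q: float) -> float:
        return self.bias + sum(
            softplus(a) * math.tanh(softplus(w) * q + c)
            for a, w, c in zip(self.outer, self.inner, self.shift)
        )


def shared_regressor(features):
    """Stand-in for the dataset-shared quality regressor (here just a mean)."""
    return sum(features) / len(features)


# Two datasets with different score ranges share one regressor.
to_dataset_a = MonotonicTransformer([1.0, -0.5], [0.3, 1.2], [0.0, -1.0], bias=50.0)
to_dataset_b = MonotonicTransformer([0.2], [2.0], [0.5], bias=3.0)

qualities = [shared_regressor(f) for f in ([0.1, 0.2], [0.4, 0.6], [0.9, 1.1])]
scores_a = [to_dataset_a(q) for q in qualities]
scores_b = [to_dataset_b(q) for q in qualities]
```

Because the ordering of `qualities` is preserved in both `scores_a` and `scores_b`, the two datasets can disagree on scale and offset without disagreeing on which image is better.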


Grasping Core Rules of Time Series through Pure Models

Gedi Liu , Yifeng Jiang , Yi Ouyang , Keyang Zhong , Yang Wang

Categories: Machine Learning | Artificial Intelligence | Machine Learning (Statistics)

2022-08-15

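Time series analysis, like many other machine learning fields, has moved from statistics to deep learning. Although accuracy appears to keep improving as models are updated on many publicly available datasets, model scale typically grows several times over in exchange for only a slight difference in accuracy. Through this experiment, we point out a different line of thinking: time series, and long-term forecasting in particular, may differ from other fields. Rather than using extensive and complex models to capture every aspect of a time series, one can use pure models to grasp its core rules. With this simple but effective idea, we created PureTS, a network with three pure linear layers that achieves state-of-the-art results on 80% of long-sequence prediction tasks while being nearly the lightest model and the fastest to run. On this basis, we discuss the potential of pure linear layers in terms of both phenomena and essence. The ability to understand the core rules contributes to high accuracy in long-range prediction, and reasonable fluctuation keeps the model from distorting the curve in multi-step prediction as mainstream deep learning models do; we summarize this as pure linear neural networks avoiding over-fluctuation. Finally, we suggest basic design criteria for lightweight long-step time series tasks: the input and output should have the same dimension where possible, and the structure should avoid fragmentation and complex operations.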
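A network of three pure linear layers can be sketched in a few lines. This is not the authors' PureTS implementation: the window length, the random untrained weights, the zero biases, and the helper names `make_linear`, `linear`, and `forecast` are all assumptions for illustration. The sketch only shows the two structural points from the abstract, namely that no activation sits between the layers and that the input and output share the same dimension.

```python
import random


def make_linear(n_in, n_out, rng):
    """Random weights and zero bias for one linear layer (illustrative only)."""
    scale = 1.0 / n_in
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(layer, x):
    """Apply one linear layer: weights @ x + bias."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def forecast(layers, history):
    """Pass the history window through the layers with no activation between them."""
    out = history
    for layer in layers:
        out = linear(layer, out)
    return out


rng = random.Random(0)
WINDOW = 8  # assumed length; input and output share this dimension
layers = [make_linear(WINDOW, WINDOW, rng) for _ in range(3)]
history = [0.1 * t for t in range(WINDOW)]
prediction = forecast(layers, history)
doubled = forecast(layers, [2 * v for v in history])
```

With zero biases the stack is a single linear map, so `doubled` equals twice `prediction` up to rounding; a trained model of this shape has far fewer moving parts than an attention-based forecaster.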


TOCH: Spatio-Temporal Object-to-Hand Correspondence for Motion Refinement

Keyang Zhou , Bharat Lal Bhatnagar , Jan Eric Lenssen , Gerard Pons-Moll

Category: Computer Vision

2022-05-16

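We present TOCH, a method for refining incorrect 3D hand-object interaction sequences using a data prior. Existing hand trackers, especially those that rely on very few cameras, often produce visually unrealistic results with hand-object intersections or missing contacts. Although correcting such errors requires reasoning about the temporal aspects of interaction, most previous works focus on static grasps and contacts. The core of our method is TOCH fields, a novel spatio-temporal representation for modeling the correspondences between hands and objects during interaction. A TOCH field is an object-centric representation that encodes the hand position relative to the object. Leveraging this novel representation, we learn a latent manifold of plausible TOCH fields with a temporal auto-encoder. Experiments show that TOCH outperforms state-of-the-art 3D hand-object interaction models, which are limited to static grasps and contacts. More importantly, our method produces smooth interactions even before and after contact. Using a single trained TOCH model, we demonstrate quantitatively and qualitatively its usefulness for correcting the output of off-the-shelf RGB/RGB-D hand-object reconstruction methods and for transferring grasps across objects.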
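The idea of an object-centric encoding can be illustrated with a small geometric sketch. This is not the TOCH field itself, whose exact definition the abstract does not give; it only assumes that the object pose in each frame is a rotation `R` and translation `t`, and the helpers `rot_z`, `to_world`, and `to_object_frame` are hypothetical names. The point is that a hand position expressed relative to the object does not change when hand and object move together.

```python
import math


def rot_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def to_world(R, t, p_local):
    """Object frame -> world frame: R @ p + t."""
    return [sum(R[i][j] * p_local[j] for j in range(3)) + t[i] for i in range(3)]


def to_object_frame(R, t, p_world):
    """World frame -> object frame: R^T @ (p_world - t)."""
    d = [p_world[i] - t[i] for i in range(3)]
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]


# A fingertip resting at a fixed spot on the object (made-up coordinates, metres).
fingertip_local = [0.05, 0.0, 0.02]

# Assumed object poses over three frames: the object is rotated and carried.
poses = [(0.0, [0.0, 0.0, 0.0]), (0.4, [0.1, 0.0, 0.05]), (0.9, [0.2, 0.1, 0.1])]

relative_per_frame = []
for theta, t in poses:
    R = rot_z(theta)
    fingertip_world = to_world(R, t, fingertip_local)  # what a tracker would report
    relative_per_frame.append(to_object_frame(R, t, fingertip_world))
```

Every entry of `relative_per_frame` matches `fingertip_local` up to rounding, so a stable grasp looks constant in this encoding even though the world-space trajectory moves.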


Look, Cast and Mold: Learning 3D Shape Manifold from Single-view Synthetic Data

Qianyu Feng , Yawei Luo , Keyang Luo , Yi Yang

Category: Computer Vision

2021-03-08

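Inferring the stereo structure of objects in the real world is a challenging yet practical task. Equipping deep models with this ability usually requires abundant 3D supervision, which is hard to obtain. Promisingly, we can simply benefit from synthetic data, where paired ground truth is easy to access. However, the domain gap is nontrivial given the variation in texture, shape, and context. To overcome these difficulties, we propose a Visio-Perceptual Adaptive Network for single-view 3D reconstruction, called VPAN. To generalize the model to real scenes, we propose to address several aspects: (1) Look: visually incorporate spatial structure from the single view to enhance the expressiveness of the representation; (2) Cast: perceptually align 2D image features with 3D shape priors through cross-modal semantic contrastive mapping; (3) Mold: reconstruct the stereo shape of the target by transforming the embeddings into the desired manifold. Extensive experiments on several benchmarks demonstrate the effectiveness and robustness of the proposed method in learning the 3D shape manifold from synthetic data via a single view. The proposed method outperforms state-of-the-art methods on the Pix3D dataset with IoU 0.292 and CD 0.108, and reaches IoU 0.329 and CD 0.104 on Pascal 3D+.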
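The two reported metrics, IoU and CD (Chamfer distance), can be sketched as follows. The abstract does not say which variants the authors use, so this assumes IoU over occupied voxel coordinates and a symmetric Chamfer distance built from mean squared nearest-neighbor distances; `voxel_iou`, `chamfer_distance`, and the toy shapes are illustrative only.

```python
def voxel_iou(pred, gt):
    """Intersection over union of two sets of occupied voxel coordinates."""
    pred, gt = set(pred), set(gt)
    union = pred | gt
    return len(pred & gt) / len(union) if union else 1.0


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean squared nearest-neighbor distance, both ways."""

    def sq_dist(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    def one_way(src, dst):
        return sum(min(sq_dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Toy shapes with made-up coordinates.
pred_voxels = [(0, 0, 0), (0, 0, 1), (0, 1, 0)]
gt_voxels = [(0, 0, 0), (0, 0, 1), (1, 0, 0)]
iou = voxel_iou(pred_voxels, gt_voxels)  # 2 shared voxels out of 4 in the union

pred_points = [(0, 0, 0), (1, 0, 0)]
gt_points = [(0, 0, 0), (1, 0, 1)]
cd = chamfer_distance(pred_points, gt_points)
```

Higher IoU is better and lower CD is better, which is why the abstract reports both: IoU rewards volume overlap, while CD penalizes surface points that sit far from the reference shape.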


Multimodal Machine Learning for Automated ICD Coding

Keyang Xu , Mike Lam , Jingzhi Pang , Xin Gao , Charlotte Band , Piyush Mathur , Frank Papay , Ashish K. Khanna , Jacek B. Cywinski , Kamal Maheshwari

Categories: Machine Learning | Machine Learning (Statistics)

2018-10-31

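This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that handle data from different modalities, including unstructured text, semi-structured text, and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate ICD-10 codes. Key evidence was also extracted to make our predictions more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to validate our approach. For ICD code prediction, our best-performing model (Micro-F1 = 0.7633, Micro-AUC = 0.9541) significantly outperforms other baseline models, including TF-IDF (Micro-F1 = 0.6721, Micro-AUC = 0.7879) and a Text-CNN model (Micro-F1 = 0.6569, Micro-AUC = 0.9235). For interpretability, our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780 and 0.5002, respectively.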
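The Micro-F1 and Jaccard figures are easier to interpret with the definitions written out. The sketch below uses the standard definitions of both metrics on sets of labels; the code sets, evidence tokens, and function names are made-up example data, not values from MIMIC-III or from the paper.

```python
def micro_f1(true_sets, pred_sets):
    """Micro-averaged F1: pool true/false positives and false negatives over all records."""
    tp = sum(len(t & p) for t, p in zip(true_sets, pred_sets))
    fp = sum(len(p - t) for t, p in zip(true_sets, pred_sets))
    fn = sum(len(t - p) for t, p in zip(true_sets, pred_sets))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def jaccard(a, b):
    """Jaccard similarity coefficient: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Two hypothetical admissions, each with a set of assigned codes.
true_codes = [{"I10", "E11.9"}, {"J18.9"}]
pred_codes = [{"I10"}, {"J18.9", "I10"}]
f1 = micro_f1(true_codes, pred_codes)  # tp=2, fp=1, fn=1

# Evidence overlap between a model and a physician (hypothetical tokens).
model_evidence = {"chest", "pain", "fever"}
physician_evidence = {"fever", "cough"}
jsc = jaccard(model_evidence, physician_evidence)  # 1 shared token of 4 distinct
```

Micro-averaging pools counts across all codes, so frequent codes dominate the score, while the Jaccard coefficient measures how much of the highlighted evidence two annotators agree on.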


Translating Text Synopses to Video Storyboards

Xu Gu , Yuchong Sun , Feiyue Ni , Shizhe Chen , Ruihua Song , Boyuan Li , Xiang Cao

Category: Computer Vision

2022-12-31

A storyboard is a roadmap for video creation that consists of shot-by-shot images visualizing key plots in a text synopsis. Creating video storyboards, however, remains challenging: it requires not only association between high-level texts and images but also long-term reasoning to make transitions smooth across shots. In this paper, we propose a new task called Text synopsis to Video Storyboard (TeViS), which aims to retrieve an ordered sequence of images to visualize the text synopsis. We construct a MovieNet-TeViS benchmark based on the public MovieNet dataset. It contains 10K text synopses, each paired with keyframes that are manually selected from the corresponding movies by considering both relevance and cinematic coherence. We also present an encoder-decoder baseline for the task. The model uses a pretrained vision-and-language model to improve high-level text-image matching. To improve coherence in long-term shots, we further propose to pre-train the decoder on large-scale movie frames without text. Experimental results demonstrate that our proposed model significantly outperforms other models in creating text-relevant and coherent storyboards. Nevertheless, there is still a large gap compared to human performance, suggesting room for promising future work.


Certified Policy Smoothing for Cooperative Multi-Agent Reinforcement Learning

Ronghui Mu , Wenjie Ruan , Leandro Soriano Marcolino , Gaojie Jin , Qiang Ni

Category: Machine Learning

2022-12-22

Cooperative multi-agent reinforcement learning (c-MARL) is widely applied in safety-critical scenarios, thus the analysis of robustness for c-MARL models is profoundly important. However, robustness certification for c-MARLs has not yet been explored in the community. In this paper, we propose a novel certification method, which is the first work to leverage a scalable approach for c-MARLs to determine actions with guaranteed certified bounds. c-MARL certification poses two key challenges compared with single-agent systems: (i) the accumulated uncertainty as the number of agents increases; (ii) the potential lack of impact when changing the action of a single agent into a global team reward. These challenges prevent us from directly using existing algorithms. Hence, we employ the false discovery rate (FDR) controlling procedure considering the importance of each agent to certify per-state robustness and propose a tree-search-based algorithm to find a lower bound of the global reward under the minimal certified perturbation. As our method is general, it can also be applied in single-agent environments. We empirically show that our certification bounds are much tighter than state-of-the-art RL certification solutions. We also run experiments on two popular c-MARL algorithms: QMIX and VDN, in two different environments, with two and four agents. The experimental results show that our method produces meaningful guaranteed robustness for all models and environments. Our tool CertifyCMARL is available at https://github.com/TrustAI/CertifyCMA
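The abstract mentions an FDR controlling procedure without naming one. As a hedged illustration, the sketch below implements the Benjamini-Hochberg step-up procedure, a standard way to control the false discovery rate; treating each agent's robustness test as producing one p-value is an assumption made here, and the p-values are made up. The paper's handling of agent importance is not modeled.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns one boolean per hypothesis: True where the null is rejected
    while controlling the false discovery rate at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        # Keep the largest rank whose p-value is under its threshold.
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# One hypothetical p-value per agent for a given state.
agent_p_values = [0.001, 0.021, 0.028, 0.035, 0.6]
decisions = benjamini_hochberg(agent_p_values, alpha=0.05)
```

Here the second p-value (0.021) misses its own threshold of 0.02 but is still rejected, because a later rank passes; that step-up behavior is what distinguishes this procedure from testing each agent in isolation.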


HYRR: Hybrid Infused Reranking for Passage Retrieval

Jing Lu , Keith Hall , Ji Ma , Jianmo Ni

Category: Natural Language Processing

2022-12-20

We present Hybrid Infused Reranking for Passages Retrieval (HYRR), a framework for training rerankers based on a hybrid of BM25 and neural retrieval models. Retrievers based on hybrid models have been shown to outperform both BM25 and neural models alone. Our approach exploits this improved performance when training a reranker, leading to a robust reranking model. The reranker, a cross-attention neural model, is shown to be robust to different first-stage retrieval systems, achieving better performance than rerankers simply trained upon the first-stage retrievers in the multi-stage systems. We present evaluations on a supervised passage retrieval task using MS MARCO and zero-shot retrieval tasks using BEIR. The empirical results show strong performance on both evaluations.
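A hybrid first-stage retriever of the kind described above can be sketched as a score fusion. The abstract does not state how BM25 and neural scores are combined, so the min-max normalization, the linear interpolation with `weight`, and all passage IDs and scores below are assumptions for illustration rather than the HYRR recipe.

```python
def min_max(scores):
    """Rescale a {passage_id: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {doc: (s - lo) / span if span else 0.0 for doc, s in scores.items()}


def hybrid_rank(bm25_scores, neural_scores, weight=0.5):
    """Rank passages by a weighted sum of normalized BM25 and neural scores.

    A passage missing from one retriever's list contributes 0 from that side.
    """
    bm25_n, neural_n = min_max(bm25_scores), min_max(neural_scores)
    docs = set(bm25_n) | set(neural_n)
    fused = {
        d: weight * bm25_n.get(d, 0.0) + (1 - weight) * neural_n.get(d, 0.0)
        for d in docs
    }
    return sorted(fused, key=lambda d: (-fused[d], d))


# Made-up scores for one query.
bm25 = {"p1": 12.0, "p2": 7.0, "p3": 2.0}
neural = {"p2": 0.9, "p3": 0.8, "p4": 0.1}
candidates = hybrid_rank(bm25, neural, weight=0.5)
```

In this toy query `p2` ranks first because both retrievers score it well. In a setup like the one the abstract describes, a fused candidate list of this kind would supply the training examples for the cross-attention reranker.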


An Efficient Drug-Drug Interactions Prediction Technology for Molecularly Intelligent Manufacturing

Peng Gao , Feng Gao , Jian-Cheng Ni , Hamido Fujita

Category: Natural Language Processing

2022-12-19

Drug-drug interaction (DDI) prediction is an essential issue in the molecular field. Traditional methods of observing DDIs in medical experiments require plenty of resources and labor. In this paper, we present a computational model, dubbed MedKGQA, based on Graph Neural Networks that automatically predicts DDIs after reading multiple medical documents, in the form of multi-hop machine reading comprehension. We introduce a knowledge fusion system to obtain the complete nature of drugs and proteins, and exploit a graph reasoning system to infer the drugs and proteins contained in the documents. Our model significantly improves on previous state-of-the-art models on the QANGAROO MedHop dataset, achieving a 4.5% improvement in DDI prediction accuracy.
